System for obtaining correct byte addresses by XOR-ING 2 LSB bits of byte address with binary 3 to facilitate compatibility between computer architecture having different memory orders

ABSTRACT

A method and apparatus for enabling a computer to run using either a Big Endian or Little Endian architecture is provided. The method and apparatus use the fact that XORing the lower two bits of a byte address in one architecture with a binary 3 converts that byte address to the equivalent byte address in the other architecture. The conversion method and apparatus is implemented in hardware by setting a bit in a status register indicating a Big Endian or Little Endian architecture in conjunction with an XOR gate which couples the byte address to binary 3. The conversion method and apparatus is implemented in software by scanning the instructions of the input for load and store instructions. The software modifies the instructions by taking the contents of the register and XORing the two least significant bits of the byte address with a binary 3.

This is a continuation of application Ser. No. 07/564,923, filed Aug. 9, 1990, now abandoned.

BACKGROUND OF THE INVENTION

This invention is in the field of digital computers. In a preferred embodiment, a method and apparatus for enabling a computer to run programs which utilize either of at least two different byte orders is disclosed.

It is known that different computer systems organize the words of data in their memories differently. Some computers store words of data with the least significant bit residing at the lowest address. Such machines have the so-called "Little Endian" ("LE") architecture. Other computers store data with the most significant bit (or, in some cases, the sign bit) residing at the lowest address. These machines have a "Big Endian" ("BE") architecture. Numerous articles describe these data organization systems in greater detail. One such article is Cohen, "On Holy Wars and a Plea for Peace," Computer, 10/81, pp. 48-54.

Whether a machine is BE or LE makes little difference within that particular machine. Although each architecture has its proponents, the consensus appears to be that both architectures are equally advantageous.

The use of two architectures presents a problem when machines of different architectures must interact or when software written for one type of machine is run on a machine of a different type. In these situations, data or programs stored in one machine according to one architecture would be misinterpreted by a machine of a different architecture. Additionally, instructions which access or manipulate parts of data words will have greatly different results.

The incompatibility between little- and big-endian machines has generated numerous attempts to improve interoperability. Most of these attempts have resulted in a hardware apparatus, usually comprising a combination of shift registers and various logic gates. Although such hardware may allow both BE and LE instructions to operate, the hardware adds to the computer system's complexity and reduces its speed. Both of these results are undesirable.

SUMMARY OF THE INVENTION

The present invention, in one of its embodiments, corrects the known disadvantages of the prior art by providing a software module, compatible with most microprocessors, which allows a BE program to run on a LE machine and vice versa.

The module is part of the operating system software and utilizes the fact that for computer words containing four bytes of data, XORing the low two bits of the address of one word with binary 3 results in addressing the desired byte for a machine of opposite byte order. Although the discussion of the invention herein only describes its operation with 4 byte words, the invention could be used with words having a different number of bytes.

Both a software and a hardware implementation of the present invention are possible. Although a software implementation requires no additional or special hardware, it does run slower than a hard-wired apparatus that performs the same function. Therefore, in some circumstances, a hardware implementation of the present invention would be desirable. In the hardware implementation, a bit in one register is set or unset to indicate big or little endianness. The bit indicates that the endianness of certain part-word instructions must be changed.

The invention is described in detail below with reference to the following illustrations in which:

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the process of loading a word from memory;

FIG. 2 shows how a "Load Byte" instruction recalls different bytes depending on the type of machine running the instruction;

FIG. 3 shows how byte order can be transposed using an XOR instruction;

FIG. 4 shows how changing the offset of a machine can change the nature of the machine;

FIG. 5 shows how transferring data from a file depends on byte order;

FIG. 6 shows how loads for one type of machine can be converted into loads for a different type of machine;

FIG. 7 shows how to swap the memory locations of the bytes in a word; and

FIG. 8 is a chart showing the possible machine overhead required to emulate byte orders.

DESCRIPTION OF THE SPECIFIC EMBODIMENT(S)

The present invention allows a single computer to run programs intended for either BE or LE order. A simple technique called endian switching allows one computer to mimic a computer of different byte order with a minor performance overhead.

Most computers are word addressed machines. Data is accessed in words, not parts of words, under average conditions. Two machines of different byte order are quite similar with respect to words. Data is stored in words and machine operation is the same between big and little endian machines with regard to operations using or referencing words. A program that only references words would run on a machine of either byte order.

This principle is demonstrated by the load word instruction 10 shown in FIG. 1, which instruction is for a LE machine. The same load word instruction would function in the identical manner if a BE machine were used. In this description, an address 12 will be represented by a pair of symbols comprising a word address 12a followed by a two bit byte address 12b.

The difference between BE and LE machines is shown herein by the interpretation of the lower two bits of the address when using byte memory operations. A LE machine selects the 8 bits that include the least significant bit ("LSB") when the address has 00 as the lower two bits. A BE machine selects the 8 bits that contain the most significant bit ("MSB") when using the same address bits. In some BE machines, the byte selected would include the sign bit for the data.
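By way of illustration only, the byte selection rule just described can be sketched in Python. The function name `load_byte`, the modelling of a word as a 32-bit integer, and the sample word are assumptions made for this sketch and do not appear in the figures.

```python
def load_byte(word: int, byte_addr: int, big_endian: bool) -> int:
    """Return the byte of a 32-bit word selected by the low two address bits."""
    index = byte_addr & 0b11           # only the low two bits select a byte
    if big_endian:
        shift = 8 * (3 - index)        # address 00 selects the byte holding the MSB
    else:
        shift = 8 * index              # address 00 selects the byte holding the LSB
    return (word >> shift) & 0xFF


# Hypothetical sample word: 0x4D is the most significant byte, 0x53 the least.
word = 0x4D495053
be_byte = load_byte(word, 0b00, big_endian=True)
le_byte = load_byte(word, 0b00, big_endian=False)
```

With the same address bits 00, the BE interpretation yields the most significant byte and the LE interpretation yields the least significant byte.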

FIG. 2 shows load byte operations 14 and 16 for a BE and LE machine, respectively. In the figure, the word (18 or 20) that contains the desired byte is extracted from memory using only the word address (22 or 24) and is placed in an alignment register (26 or 28). In both machines, a reset-time set_endian flag, which is initialized during power-up routines, is used to indicate what type of machine is being used and which 8 bits are to be placed in the LSB position in register (34 or 36). As shown in the figure, the BE memory places the sign and most significant bit into the lowest byte address 38 and the LE memory places the least significant bit in the lowest byte address 40.

FIG. 3 illustrates the principle that by using the two low order address bits to number the bytes starting from either the least significant bit ("LSB") or most significant bit ("MSB"), either a BE or LE machine can be created. In the illustrated example, byte 42 addressed by the low two bits 00 for the BE machine contains an `M` and the low two bits 11 address a byte 44 containing an `S`. For the LE machine, the situation is exactly reversed: the byte 44 addressed by 00 contains an `S` and the byte 42 addressed by 11 contains an `M`. As illustrated, exclusive OR-ing (`XORing`) the byte address used by either machine with "11" (binary 3) results in the correct byte address to access the same data using the opposite data ordering convention. FIG. 4 further shows how the two types of machine are really quite similar in that merely changing the offset of a given address can make a given machine act like a machine of a different byte order. It should also be noted that the offsets for different machines are, in this example, mirror images of each other.
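The address conversion can be sketched as follows. This is a minimal illustration, assuming four byte words; the function name `convert_byte_address` is hypothetical.

```python
def convert_byte_address(addr: int) -> int:
    """Map a byte address in one byte order to the equivalent address in the other."""
    return addr ^ 0b11                 # flips only the two least significant bits


# The four possible low-bit patterns and their counterparts.
mapping = {low: convert_byte_address(low) for low in range(4)}
for low, converted in mapping.items():
    print(f"{low:02b} -> {converted:02b}")
```

The mapping pairs 00 with 11 and 01 with 10, the word address bits are left unchanged, and applying the conversion twice returns the original address.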

Initially, treating BE and LE machines as identical with respect to words may appear incorrect. In general, it is known that both a BE and LE machine can read from the same ASCII file, with problems occurring when the machines exchange word length binary data. When consecutive bytes are written to a file the results for both types of machine follow expectations. However, when the set of bytes is treated as a single word, the results are often incorrect. These discrepancies can be explained by knowing that each machine's Input/Output (`I/O`) system knows how to disassemble words into a string of bytes. All I/O operations are character oriented. Each system and machine has built into it the means to assemble and disassemble words as part of its I/O apparatus. FIG. 5 shows a string of characters 48 in a file and how those characters 48 are mapped into a word 50 by a BE machine and into a word 52 by a LE machine. As the figure illustrates, it is the combination of how the processor extracts bytes from a word and how the I/O system assembles and disassembles words that establishes whether a machine is BE or LE.
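The effect of the I/O system's word assembly can be illustrated with a short Python sketch. The four character string used here is an assumed example and is not the string shown in FIG. 5.

```python
# Four consecutive characters as they would appear in a file (assumed example).
characters = b"ABCD"

# A BE machine places the first character in the most significant byte;
# a LE machine places the first character in the least significant byte.
be_word = int.from_bytes(characters, "big")
le_word = int.from_bytes(characters, "little")

print(hex(be_word), hex(le_word))
```

The same byte stream therefore yields two different word values, which is why word length binary data cannot be exchanged between the two machine types without compensation.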

Emulation of Byte Order

As the previous discussion indicates, proper byte order treatment is controlled by two things: the interpretation of the low address bits and the I/O rules for translating between a byte stream and a word in memory. If both these issues can be addressed in either hardware or software, a machine can be created which runs a program in either byte order.

The first convention, the interpretation of the low address bits, is implemented in software as part of the computer in a preferred embodiment of the present invention and relies on the observation that byte addresses in one byte order can be made to simulate those of the other byte order by XORing the two low order bits of the first byte order with binary three. The software to perform this task functions at or after the program's compilation time. For example, the computer containing the present invention for inverting the byte order can read the program to be loaded and adjust each partial word memory reference to account for the different convention of how bytes are packed in a word.

Three standard types of byte addressing instructions must be accommodated in a conversion program to change or rearrange byte order consistently. These cases are illustrated in FIG. 6.

In case 1, it is known that the base register is always aligned on a word. In this case, the calculation of the new base address is performed by the system compiler, which merely XORs the offset with binary 3 to give the proper displacement.

In case 2, there is zero or no offset, and the base alignment is unknown. When bytes are addressed in this manner, the value in the base register is XORed with binary 3. The result is used as the correct offset.

Where neither the base alignment nor the offset alignment is known (case 3), the three instructions shown are used to address the correct byte in the new order. These instructions run very fast, so performing the extra instructions hardly impacts the system's performance.
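The three cases can be sketched in Python as follows. The function names are hypothetical, and because FIG. 6 is not reproduced here, the breakdown of case 3 into an add followed by an XOR ahead of the byte access is an assumption about the three instructions rather than a transcription of them.

```python
def case1_offset(offset: int) -> int:
    """Case 1: base known to be word aligned; the offset is rewritten ahead of time."""
    return offset ^ 0b11


def case2_address(base: int) -> int:
    """Case 2: zero offset, base alignment unknown; the base value is XORed at run time."""
    return base ^ 0b11


def case3_address(base: int, offset: int) -> int:
    """Case 3: neither alignment known; form the full address, then XOR it."""
    address = base + offset            # assumed first step: add base and offset
    return address ^ 0b11              # assumed second step: XOR before the byte access


aligned_base = 0x1000
# For an aligned base, rewriting the offset alone matches the general computation.
case1_matches = all(
    aligned_base + case1_offset(off) == case3_address(aligned_base, off)
    for off in range(-8, 16)
)
```

Case 1 costs nothing at run time because a word aligned base contributes nothing to the two low order bits, so XORing the offset alone gives the same address as XORing the sum.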

All three cases, as stated, deal with the proper interpretation of the low address bits. Although the preferred embodiment of the present invention uses software to accomplish these address conversions, a hardware modification to the system processor could accomplish the same aim. Some processors permit the byte order of the processor to be set at system reset time. In this case, it is relatively simple to adjust the machine's byte order on a program-by-program (task-by-task) basis, using a user mode program. The easiest way to do this is to define a bit in the status register that indicates whether to run the new process in the same or different byte order as the byte order specified in either the previous program or at the last reset. Setting this flag would effectuate, for the duration of the new program, a universal shift in byte order. When this technique is implemented in hardware, the system notes that the bit indicating a different "endianness" has been turned on. Instructions which access parts of words are not altered. Instead, the 32 bit address bus has a branch pathway. 30 bits of the address are placed in a load aligner register. The remaining two bits are XORed with binary 3 to obtain the correct byte address. The 30 bits and 2 bits are then recombined in the load aligner. This method does not degrade the system's performance as the program is operating exactly as it would have had the reset time flag been set to the indicated byte order. These operations will only occur if the system is both in user mode and the reverse endian bit is on. The instructions affected include the load and store instructions.
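The hardware address path described above can be modelled in software for illustration. The sketch below assumes a 32 bit address; the function and parameter names are hypothetical and do not correspond to any particular processor's register names.

```python
def effective_address(addr: int, user_mode: bool, reverse_endian: bool) -> int:
    """Model of the split address path: 30 word bits pass through, 2 byte bits may be XORed."""
    word_bits = addr & 0xFFFFFFFC      # upper 30 bits, routed unchanged
    byte_bits = addr & 0b11            # lower 2 bits, routed through the XOR gate
    if user_mode and reverse_endian:   # conversion applies only when both conditions hold
        byte_bits ^= 0b11
    return word_bits | byte_bits       # recombined as in the load aligner


converted = effective_address(0x80001001, user_mode=True, reverse_endian=True)
unchanged = effective_address(0x80001001, user_mode=False, reverse_endian=True)
```

In this model the address is altered only when the machine is in user mode with the reverse endian bit set; in every other combination the address passes through unmodified.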

The second convention, conversion of words into and from byte streams in the I/O system, is also accomplished in software in the preferred embodiment. By uniformly word swapping (as shown in FIG. 7) every buffer 60 of data that is read or written to, the machine's built-in convention can be accommodated. Word swapping is the operation of exchanging the two outer bytes 61 and 64 and the two inner bytes 62 and 63 (i.e. switch byte 61 with byte 64, and switch byte 62 with byte 63). The code 66 necessary to implement word swapping is also shown in FIG. 7.
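A word swap over a buffer can be sketched in Python as shown below. This is not the code 66 of FIG. 7, which is not reproduced here; the function name `word_swap` and the requirement that the buffer length be a multiple of four are assumptions of this sketch.

```python
def word_swap(buffer: bytes) -> bytes:
    """Reverse the byte order within every 4 byte word of the buffer."""
    if len(buffer) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4 bytes")
    swapped = bytearray(len(buffer))
    for start in range(0, len(buffer), 4):
        # Exchange the two outer bytes and the two inner bytes of the word.
        swapped[start:start + 4] = buffer[start:start + 4][::-1]
    return bytes(swapped)


swapped_buffer = word_swap(b"ABCDEFGH")
```

Swapping the same buffer a second time restores the original contents, so one routine serves for both reading and writing.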

Implementation of an Endian Switcher

Several alternate methods of implementing both described conventions are possible. As many machines will not have the hardware modifications discussed herein for byte order emulation, pure software emulation will also be considered.

If we assume that the computer on which both LE and BE programs are to run has no hardware modifications, a conversion program could be used. The program would word swap both the program and the initialized data. Also, the code would need to be "rewritten" to address partial words properly. The modifications could be done when the program is entered into the machine and the altered program can be stored in its translated form.
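That the two modifications work together can be checked with a small Python sketch: once a memory image has been word swapped, a byte access whose address has been XORed with binary 3 returns the byte the original program expected. The function names and the sample image are assumptions made for illustration.

```python
def word_swap(image: bytes) -> bytes:
    """Reverse the bytes within every 4 byte word of a memory image."""
    if len(image) % 4 != 0:
        raise ValueError("image length must be a multiple of 4 bytes")
    return b"".join(image[i:i + 4][::-1] for i in range(0, len(image), 4))


def translation_is_consistent(image: bytes) -> bool:
    """Check that every original byte is found at its XORed address in the swapped image."""
    swapped = word_swap(image)
    return all(swapped[addr ^ 0b11] == image[addr] for addr in range(len(image)))


sample_image = bytes(range(16))        # hypothetical 16 byte memory image
consistent = translation_is_consistent(sample_image)
```

Word accesses need no rewriting in this scheme, since swapping the bytes of each word already places them where the host machine expects them.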

In an environment comprising a network of dissimilar byte order machines, if we wish to execute any program on any machine, the byte order cannot be fixed prior to execution. In this case, the program will be modified as part of its execution, the conversion occurring as the program is moved into the system memory. The execution time of the conversion program is small, and the amount of work to modify partial word memory operations is also relatively small. If the machine has the earlier noted hardware modification to interpret the low two bits in an address, the only program modification needed is to word swap the program and initialized data. If the word swapping is delayed until the application program is actually loaded (word swapping on demand) the machine overhead required for the conversion process is amortized over the entire running time of the program.

Differences between BE and LE machines related to the serialization of bytes of data in the I/O system are handled when data is read from or written to the output buffer. All the data in the output buffer is word swapped just after reading or just before writing the physical I/O from or to the disk.

Operating Systems Support

The operating system must be able both to emulate the opposite byte order and to emulate the system assumptions of the program being run. This can require both a compatible system call vector and, possibly, conversion of the data structure being passed. Every binary load module has its byte order encoded at the beginning of the module. This informs the operating system of the actions needed to emulate the program's system assumptions.

Similarities between UNIX systems make many of the operating system interfaces of other systems relatively easy to support. If there are incompatible elements in the system call vector, these can be dealt with in several ways. The operating system can identify a binary number of a different "endianness" and select a separate vector. If there are no or only a few overlaps in the assigned system calls, the entry to the opposite endian system call can change the data structure being passed (if any) and branch into the original system call.

Programs are stored as files in a byte order that assumes that the input/output mechanism converts the character data into words. Therefore, the organization of the program must be changed before the program is executed. As described, this is relatively easy, as every instruction in the load module and every initialized data word will be swapped. One reason the bytes are switched is that the network interface is byte oriented and intrinsically swaps words when moving from BE to LE and back. Compensation is thus needed.

FIG. 8 shows an estimate of the impact of simulating the opposite byte order in software. As earlier stated, the hardware extension incurs no performance cost during the computational part of the program as the program runs exactly as it would on a machine of the other byte order. As shown in FIG. 8, the range of the software penalty is 2-8.8%. Converting the input/output buffers will also cause a certain loss of performance: reading and word swapping a file takes on the order of 11.7% longer than just reading the file.

Conclusion

The software shown and discussed in the description enables a program to be run on either an LE or BE machine. Data exchange between systems of different byte orders, including data stored on disk or memory, is not within the specific teachings of this invention, but the teaching herein would be helpful in solving the problems presented by those transfers.

The present invention has now been described in detail, in the context of a specific embodiment. Nothing herein should be taken to limit this invention to the particular embodiment discussed, as numerous variations and modifications are possible while remaining within the scope of this disclosure. Given these possibilities, this invention should not be considered in a narrow, restrictive sense, but rather in a broad, expansive sense.

We claim:
 1. A method for converting a program designed to be executed on a computer system employing a first memory order to a program which is executable on a computer system employing a second memory order, the second memory order being the reverse of the first memory order, the method comprising the steps of: (a) finding all instructions in the program which operate on bytes of data, each of said bytes of data having a byte address, each byte address having two least significant bits; (b) combining the two least significant bits of each byte address with binary three using an exclusive-OR logic function, thereby generating two complementary bits for each byte address; and (c) replacing the two least significant bits of each byte address with the two complementary bits, thereby generating a new byte address for each of said bytes of data.
 2. The method of claim 1 further comprising the step of detecting whether the program is designed to be executed on a computer system employing the first memory order.
 3. The method of claim 2 further comprising the step of detecting the two least significant bits of each byte address.
 4. The method of claim 1 wherein the first memory order comprises big endian order, and the second memory order comprises little endian order.
 5. The method of claim 1 wherein the first memory order comprises little endian order, and the second memory order comprises big endian order.
 6. A computer system employing a first memory order which converts and executes programs designed to be executed on computer systems employing a second memory order, the second memory order being the reverse of the first memory order, comprising: means for finding all instructions in the program which operate on bytes of data, each of said bytes of data having a byte address, each byte address having two least significant bits; means for combining the two least significant bits of each byte address with binary three, using an exclusive-OR logic function, thereby generating two complementary bits for each byte address; and means for replacing the two least significant bits of each byte address with the two complementary bits, thereby generating a new byte address for each of said bytes of data.
 7. The computer system of claim 2 further comprising: means for detecting whether the program is designed to be executed on a computer system employing a second memory order; and means for detecting the two least significant bits of each byte address.
 8. The computer system of claim 2 wherein the first memory order comprises big endian order, and the second memory order comprises little endian order.
 9. The computer system of claim 2 wherein the first memory order comprises little endian order, and the second memory order comprises big endian order.